1 Introduction


In the present paper we derive and study estimates for the Kantorovich functionals between probability
solutions to the linear Fokker-Planck-Kolmogorov (FPK) equations for probability measures $\mu_t$ and $\sigma_t$
on $\mathbb{R}^d$, $t \in [0,T]$, with different drifts and different initial conditions:

$$\partial_t \mu_t = \mathrm{trace}(Q(x,t) D^2 \mu_t) - \mathrm{div}(B_1(x,t)\mu_t), \quad \mu|_{t=0} = \mu_0,$$

$$\partial_t \sigma_t = \mathrm{trace}(Q(x,t) D^2 \sigma_t) - \mathrm{div}(B_2(x,t)\sigma_t), \quad \sigma|_{t=0} = \sigma_0.$$

We also present an alternative approach to the well-posedness and stability of solutions to the nonlinear
FPK equations

$$\partial_t \mu_t = \mathrm{trace}(Q(x,t) D^2 \mu_t) - \mathrm{div}(B(\mu,x,t)\mu_t), \quad \mu|_{t=0} = \mu_0, \tag{1}$$

based on such estimates for linear equations.

Recently, FPK equations have been actively studied from the functional-analytic, variational and
probabilistic points of view. Interesting connections between these approaches have been found (a survey
of the current state of research is provided in [?]). Estimates connecting distances between solutions with
distances between initial data and even coefficients play an important role not only in the study of such qualitative
properties of solutions as uniqueness or stability, but also in numerical simulations. In this context, estimates
for distances between solutions to equations with different drift terms are particularly interesting.

In Section 2 we derive estimates for the Kantorovich functionals between solutions of FPK equations with
different dissipative drifts. To do this, we partially use ideas from [?]. Since these ideas cannot be
applied directly either in the case of different drifts or to nonlinear equations, new methods and ideas
are needed. The extension to these cases is carried out for Kantorovich functionals with bounded cost
functions. Moreover, we admit time-dependent coefficients and a non-unit diffusion matrix $Q$. We note
that the requirement of dissipativity is not really restrictive: in most physical examples, the drift term is
minus the gradient of a convex function, i.e. dissipative. Section 3 is concerned with applications of these
estimates to the study of the well-posedness of the Cauchy problem for the nonlinear FPK equation.
Well-posedness for the nonlinear equations has been studied by many authors even in a more general setting (see,
for example, [?]). However, we present an alternative approach to this problem that is applicable in
the case of dissipative drifts. A similar method of treating well-posedness via estimates for the distances between
solutions to linear equations was used in [5].


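Let us introduce some notation and give basic definitions. By $C_0^\infty(\mathbb{R}^d)$ and $C_0^\infty(\mathbb{R}^d \times (0,T))$ we denote
classes of infinitely smooth compactly supported functions on $\mathbb{R}^d$ and $\mathbb{R}^d \times (0,T)$ respectively. For brevity
we shall always drop the subscript $\mathbb{R}^d$ when integrating over the whole space. We shall say
that a measure $\mu$ on $\mathbb{R}^d \times [0,T]$ is given by a family of probability measures $(\mu_t)_{t\in[0,T]}$ on $\mathbb{R}^d$ (and write
$\mu(dx\,dt) = \mu_t(dx)\,dt$ or simply $\mu = \mu_t\,dt$), if $\mu_t \ge 0$, $\mu_t(\mathbb{R}^d) = 1$, for each Borel set $U$ the function $t \mapsto \mu_t(U)$
is measurable and

$$\int_{\mathbb{R}^d\times[0,T]} \varphi\,d\mu = \int_0^T\!\!\int \varphi\,d\mu_t\,dt \quad \forall\,\varphi \in C_0^\infty(\mathbb{R}^d \times (0,T)).$$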


Given a probability measure $\mu_0$ on $\mathbb{R}^d$, a symmetric Borel matrix $Q(x,t)$ and a Borel mapping $B(x,t)\colon
\mathbb{R}^d \times [0,T] \to \mathbb{R}^d$, consider the following Cauchy problem for the linear FPK equation

$$\partial_t \mu_t = \mathrm{trace}(Q(x,t) D^2 \mu_t) - \mathrm{div}(B(x,t)\mu_t), \quad \mu|_{t=0} = \mu_0. \tag{2}$$


Here $D^2$ denotes the Hessian matrix with respect to the spatial variables. Denote the elements of the diffusion
matrix $Q(x,t)$ by $q^{ij}(x,t)$, $1 \le i,j \le d$, and the elements of the vector drift $B(x,t)$ by $b^j(x,t)$, $1 \le j \le d$.
Set

$$L\varphi = q^{ij}(x,t)\,\partial^2_{x_i x_j}\varphi + b^i(x,t)\,\partial_{x_i}\varphi,$$

where summation over all repeated indices is taken. We shall say that a measure $\mu(dx\,dt) = \mu_t(dx)\,dt$ is a
solution to the Cauchy problem (2) if the mappings $q^{ij}(x,t)$, $b^i(x,t)$, $1 \le i,j \le d$, are Borel and belong to
$L^1(\mu, U \times [0,T])$ for each ball $U \subset \mathbb{R}^d$, and for each test function $\varphi \in C_0^\infty(\mathbb{R}^d)$ we have

$$\int \varphi\,d\mu_t = \int \varphi\,d\mu_0 + \int_0^t\!\!\int L\varphi\,d\mu_s\,ds \tag{3}$$

for all $t \in [0,T]$. Sometimes it is more convenient to use an equivalent definition (see [4]), more precisely,
the identity

$$\int \varphi(x,t)\,d\mu_t = \int \varphi(x,0)\,d\mu_0 + \int_0^t\!\!\int [\partial_s\varphi + L\varphi]\,d\mu_s\,ds, \tag{4}$$

for all $t \in [0,T]$ and all test functions $\varphi \in C^{2,1}(\mathbb{R}^d \times [0,T)) \cap C(\mathbb{R}^d \times [0,T])$ that are identically zero outside
some ball $U \subset \mathbb{R}^d$. If we know a priori that the drift term $B$ is integrable over $\mathbb{R}^d \times [0,T]$ with respect to
the measure $d\mu_s\,ds$ and $\varphi$ is supported on the whole $\mathbb{R}^d$ but has two continuous bounded derivatives, then (4)
also holds true for such $\varphi$ (to show this, it suffices to use a standard truncation argument).


2 Estimates for the Kantorovich functionals between solutions to 
linear equations with different drifts 


In this section, we shall focus on two solutions of the linear FPK equation with different initial conditions
and different drifts. Fix $T > 0$. Given probability measures $\mu_0$ and $\sigma_0$ on $\mathbb{R}^d$, a symmetric Borel matrix
$Q(x,t)$ and Borel mappings $B_\mu, B_\sigma\colon \mathbb{R}^d \times [0,T] \to \mathbb{R}^d$, consider two Cauchy problems

$$\begin{aligned}
\partial_t \mu_t &= \mathrm{trace}(Q(x,t) D^2 \mu_t) - \mathrm{div}(B_\mu(x,t)\mu_t), & \mu|_{t=0} &= \mu_0,\\
\partial_t \sigma_t &= \mathrm{trace}(Q(x,t) D^2 \sigma_t) - \mathrm{div}(B_\sigma(x,t)\sigma_t), & \sigma|_{t=0} &= \sigma_0.
\end{aligned} \tag{5}$$

We emphasize that the indices $\mu$ and $\sigma$ in the drift coefficients are merely used to distinguish the different
drifts (by marking the corresponding solutions), and it is not necessary to define $B$ as a map on a space of
measures.

Given a monotone nonnegative continuous function $h$ on $\mathbb{R}$ with $h(0) = 0$, introduce the Kantorovich $h$-cost
functional between the probability measures $\mu$ and $\sigma$ by

$$C_h(\mu,\sigma) := \inf_{\pi \in \Pi(\mu,\sigma)} \int_{\mathbb{R}^d \times \mathbb{R}^d} h(|x - y|)\,d\pi(x,y), \tag{6}$$

where $\Pi(\mu,\sigma)$ is the set of couplings between $\mu$ and $\sigma$. Recall that a probability measure $\pi$ on $\mathbb{R}^d \times \mathbb{R}^d$
belongs to $\Pi(\mu,\sigma)$ iff $\pi(E \times \mathbb{R}^d) = \mu(E)$, $\pi(\mathbb{R}^d \times E) = \sigma(E)$ for each Borel set $E \subset \mathbb{R}^d$. If $h$ is a concave
function with $h(r) > 0$ for $r > 0$, then $C_h$ defines a distance on the space of probability measures and turns
it into a complete metric space with topology that coincides with the usual weak one (see [?, Proposition
7.1.5]). Another important example is given by $h(r) = \min\{|r|^p, 1\}$ for some $p > 1$. In this case $C_h^{1/p}$ turns
the space of probability measures into a complete metric space. Moreover, convergence with respect to this
metric is equivalent to weak convergence (see [6, Th. 1.1.9]).

Further we assume that a monotone non-decreasing continuous bounded cost function $h$ with $h(0) = 0$ is
fixed. Set $\|h\|_\infty := \sup_{z \in \mathbb{R}^d} h(|z|) < \infty$.
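To make definition (6) concrete, here is a minimal sketch that evaluates $C_h$ exactly for two uniform empirical measures on the real line with the same number of atoms. It assumes the bounded cost $h(r) = \min\{|r|^p, 1\}$ mentioned above; the function names `bounded_cost` and `kantorovich_cost` are illustrative and not part of the paper. For equal uniform weights an optimal coupling can be chosen as a permutation of the atoms, so brute force over permutations is exact, although only feasible for a handful of points.

```python
from itertools import permutations


def bounded_cost(r, p=1.0):
    """Cost h(r) = min(|r|**p, 1); bounded, so ||h||_inf = 1 (illustrative choice)."""
    return min(abs(r) ** p, 1.0)


def kantorovich_cost(xs, ys, h=bounded_cost):
    """C_h between the uniform empirical measures on the 1D points xs and ys.

    Assumes len(xs) == len(ys). With equal uniform weights the infimum in (6)
    is attained on a permutation coupling, so we minimise over permutations.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of atoms")
    n = len(xs)
    return min(
        sum(h(x - ys[j]) for x, j in zip(xs, perm)) / n
        for perm in permutations(range(n))
    )


if __name__ == "__main__":
    # Moving one atom far away costs at most ||h||_inf / n because h is bounded.
    print(kantorovich_cost([0.0, 1.0], [0.0, 3.0]))
```

The example illustrates why a bounded cost is insensitive to how far mass is moved once the displacement exceeds one unit, which is the regime where the estimates of this section are stated.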

Throughout the paper we assume that the following regularity condition holds: 

(A1) The diffusion matrix $Q(x,t)$ has uniformly bounded elements with uniformly bounded first derivatives.
Moreover, it is strictly elliptic: there exists $\nu > 0$ such that for all $(x,t) \in \mathbb{R}^d \times [0,T]$

$$\langle Q(x,t)y, y\rangle \ge \nu |y|^2 \quad \forall\, y \in \mathbb{R}^d. \tag{7}$$

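Theorem 2.1. Let (A1) hold. Let $(\mu_t)_{t\in[0,T]}$ and $(\sigma_t)_{t\in[0,T]}$ be solutions to (5) with initial conditions $\mu_0$
and $\sigma_0$ respectively. Suppose that the drift term $B_\mu$ is $\lambda$-dissipative in $x$, i.e.

$$\langle B_\mu(x,t) - B_\mu(y,t), x - y\rangle \le \lambda |x - y|^2 \tag{8}$$

for all $x, y \in \mathbb{R}^d$ and all $t \in [0,T]$. Let

$$B_\mu(x,t) - \lambda x,\ \ B_\sigma(x,t) - \lambda x \in L^2(\mathbb{R}^d \times [0,T],\, d(\mu_s + \sigma_s)\,ds). \tag{9}$$

Then

$$C_{h_{\lambda t}}(\mu_t,\sigma_t) \le C_h(\mu_0,\sigma_0) + \|h\|_\infty \sqrt{\nu^{-1}\int_0^t\!\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds}\ \cdot \sqrt{1 + \nu^{-1}\int_0^t\!\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds}, \tag{10}$$

for all $t \in [0,T]$, where $h_s(r) := h(re^{-s})$.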

Remark 2.1. The bound (10) is obviously asymmetric in the measures: we impose dissipativity on $B_\mu$, and the
integration in the right-hand side is taken over $\sigma$. This property might be interesting from the point of view
of possible numerical simulations. Indeed, if we want to solve an FPK equation

$$\partial_t \mu_t = \mathrm{trace}(Q(x,t) D^2 \mu_t) - \mathrm{div}(B(x,t)\mu_t)$$

with a dissipative drift $B$, we can approximate the drift with "better" drifts $B_n$ and solve FPK equations with
those drifts. Then (10) controls the distance between the desired solution $\mu_t$ and the approximate solution
$\mu_t^n$ in terms of the distance between the drifts integrated over the known solution $\mu_t^n$.
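As a small illustration of how the right-hand side of (10) could be evaluated in a numerical experiment of the kind described in Remark 2.1, the sketch below computes the bound from four scalar inputs. The function name `kantorovich_bound` and its parameter names are hypothetical; the quantity `drift_gap_sq` stands for the integral $\int_0^t\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds$, which is assumed to have been approximated elsewhere (for example by quadrature over the known solution).

```python
import math


def kantorovich_bound(c0, h_sup, nu, drift_gap_sq):
    """Right-hand side of estimate (10).

    c0           -- C_h(mu_0, sigma_0), distance between the initial data
    h_sup        -- ||h||_inf, the supremum of the bounded cost
    nu           -- ellipticity constant from (A1), must be positive
    drift_gap_sq -- assumed value of the double integral of |B_mu - B_sigma|^2
    """
    if nu <= 0:
        raise ValueError("the ellipticity constant nu must be positive")
    if drift_gap_sq < 0:
        raise ValueError("the squared drift gap cannot be negative")
    scaled = drift_gap_sq / nu
    return c0 + h_sup * math.sqrt(scaled) * math.sqrt(1.0 + scaled)


if __name__ == "__main__":
    # Identical drifts: the bound reduces to the initial distance.
    print(kantorovich_bound(0.1, 1.0, 1.0, 0.0))
    # A small drift gap gives a correction of order sqrt(gap / nu).
    print(kantorovich_bound(0.1, 1.0, 2.0, 1e-4))
```

For small drift gaps the correction behaves like $\|h\|_\infty \sqrt{\nu^{-1} \cdot \text{gap}}$, so halving the $L^2$ distance between the drifts roughly halves the additional error.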


Proof. Let $(\mu_t)_{t\in[0,T]}$ and $(\sigma_t)_{t\in[0,T]}$ satisfy the assumptions of the theorem. By virtue of [2] the measures $\mu_t$
and $\sigma_t$ have strictly positive densities with respect to Lebesgue measure on $\mathbb{R}^d$ for each $t \in [0,T]$. We split
the proof of (10) into several steps.




Step 1. Reduction to the dissipative case ($\lambda = 0$). We rescale the problem, keeping the cost function
unchanged, in order to reduce the problem to the case of a dissipative drift $B_\mu$. To this end we use the
rescaling procedure from [?] with the opposite sign (since our drift term and the drift term in the cited
work have opposite signs). For completeness, we provide the rescaling procedure: for $\lambda \ne 0$ define the
change of time

$$s(t) := \int_0^t e^{-2\lambda r}\,dr = \frac{1 - e^{-2\lambda t}}{2\lambda}, \qquad t(s) = \frac{-\ln(1 - 2\lambda s)}{2\lambda}, \quad s \in [0, S_\infty),$$

where $S_\infty = +\infty$ for $\lambda < 0$ and $S_\infty = 1/(2\lambda)$ for $\lambda > 0$. For measures $\mu_t$ and $\sigma_t$ introduce their rescaled
versions $\hat\mu_s$ and $\hat\sigma_s$: for each Borel set $E \subset \mathbb{R}^d$ define $\hat w_s(E) := w_{t(s)}(e^{\lambda t(s)}E)$ for $w = \mu, \sigma$. We notice
that $C_h(\hat\mu_s, \hat\sigma_s) = C_{h_{\lambda t}}(\mu_t, \sigma_t)$. Since $B_\mu$ is $\lambda$-dissipative, $A_\mu := B_\mu - \lambda I$ is dissipative. Define the rescaled
diffusion coefficient by

$$\hat Q(y,s) := Q(e^{\lambda t(s)}y, t(s))$$

and the rescaled drifts by

$$\hat B_w(y,s) := e^{\lambda t(s)} B_w(e^{\lambda t(s)}y, t(s)), \qquad \hat A_w(y,s) := e^{\lambda t(s)} A_w(e^{\lambda t(s)}y, t(s)), \quad w = \mu, \sigma.$$

Note that $\hat A_\mu$ is also a dissipative operator. The measure $\mu = \mu_t\,dt$ is a solution to

$$\partial_t \mu_t = \mathrm{trace}(Q D^2 \mu_t) - \mathrm{div}(B_\mu \mu_t)$$

iff the rescaled version $\hat\mu = \hat\mu_s\,ds$ is a solution to

$$\partial_s \hat\mu_s = \mathrm{trace}(\hat Q D^2 \hat\mu_s) - \mathrm{div}(\hat A_\mu \hat\mu_s); \tag{11}$$

moreover, (9) holds true iff for all nonnegative $s_1 < s_2 < s(T) < S_\infty$ one has

$$\int_{s_1}^{s_2}\!\!\int |\hat A_\mu(x,s)|^2\,d\hat\mu_s\,ds < +\infty.$$

The integrability statement follows immediately from the change of variables formula; the identity (11) can
be checked explicitly: it suffices to consider the change of variables $X(x,t) := (e^{-\lambda t}x, s(t))$ and calculate the
derivatives. A similar statement holds for $\sigma$ and $\hat\sigma$. This means that it is sufficient to prove (10) only in the
case $\lambda = 0$, i.e. in the case of a dissipative drift term $B_\mu$.
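The change of time used in Step 1 can be checked numerically. The following sketch implements $s(t) = (1 - e^{-2\lambda t})/(2\lambda)$ and its inverse $t(s) = -\ln(1 - 2\lambda s)/(2\lambda)$ and returns the endpoint $S_\infty$. The function names are illustrative, and the branch for $\lambda = 0$ (identity time change) is an assumption added for convenience, since the text defines the rescaling only for $\lambda \ne 0$.

```python
import math


def s_of_t(t, lam):
    """Rescaled time s(t) = (1 - exp(-2*lam*t)) / (2*lam)."""
    if lam == 0.0:
        return t  # assumed limiting case, not covered by the text
    return (1.0 - math.exp(-2.0 * lam * t)) / (2.0 * lam)


def t_of_s(s, lam):
    """Inverse change of time t(s) = -ln(1 - 2*lam*s) / (2*lam)."""
    if lam == 0.0:
        return s  # assumed limiting case, not covered by the text
    arg = 1.0 - 2.0 * lam * s
    if arg <= 0.0:
        raise ValueError("s must lie in [0, S_infinity)")
    return -math.log(arg) / (2.0 * lam)


def s_infinity(lam):
    """Right endpoint of the rescaled time interval."""
    return 1.0 / (2.0 * lam) if lam > 0 else math.inf


if __name__ == "__main__":
    for lam in (0.5, -0.5):
        s = s_of_t(1.3, lam)
        print(lam, s, t_of_s(s, lam), s_infinity(lam))
```

For $\lambda > 0$ the whole half-line $t \ge 0$ is compressed into the bounded interval $[0, 1/(2\lambda))$, which is why the rescaled problem is considered only for $s < S_\infty$.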


Step 2. Approximation of the drift term. We construct a family of smooth (in both variables) bounded
Lipschitz (as functions of $x$) dissipative operators $A_k^\varepsilon(x,t)$, approximating the dissipative drift term $B_\mu(x,t)$.

For each $t$ the operator $B_\mu(\cdot,t)$ can be approximated by Lipschitz in $x$ bounded dissipative operators $A_k(\cdot,t)$
with bounded first order derivatives with respect to the spatial variables (see [15, Th. 2.4, 2.5]): for each
$t \in [0,T]$

$$\lim_{k\to\infty} A_k(x,t) = B_\mu(x,t) \ \text{ for a.e. } x \in \mathbb{R}^d \quad\text{and}\quad \sup_{x\in\mathbb{R}^d} |A_k(x,t)| \le k + 1. \tag{12}$$

Fix some non-negative function $\eta \in C_0^\infty([0,T])$ such that $\|\eta\|_{L^1([0,T])} = 1$. Introduce a family of mollifiers
$\eta_\varepsilon(t) := \varepsilon^{-1}\eta(t/\varepsilon)$. Since for each $k$ the mapping $A_k(x,t)$ is bounded, the mappings $A_k^\varepsilon(x,t) := \eta_\varepsilon(t) * A_k(x,t)$
have bounded derivatives of all orders with respect to $t$ and converge to $A_k(x,t)$ as $\varepsilon \to 0$ for a.e. $(x,t) \in
\mathbb{R}^d \times [0,T]$. Notice that $A_k^\varepsilon$ also have bounded first order derivatives with respect to the spatial variables.
Moreover, $A_k^\varepsilon$ are dissipative in $x$:

$$\langle A_k^\varepsilon(x,t) - A_k^\varepsilon(y,t), x - y\rangle = \eta_\varepsilon(t) * \langle A_k(x,t) - A_k(y,t), x - y\rangle \le 0.$$

Finally, we define operators $L_k^\varepsilon$ as follows: for $(x,s) \in \mathbb{R}^d \times [0,T]$

$$L_k^\varepsilon[\varphi](x,s) := \mathrm{trace}(Q(x,s) D^2\varphi(x,s)) + \langle A_k^\varepsilon(x,s), \nabla_x\varphi(x,s)\rangle, \quad \varphi \in C^2(\mathbb{R}^d).$$


Step 3. Reduction of the class of test functions. It is well-known (for example, [15, Th. 1.3]) that
the problem (6) admits a dual formulation: define the class $\Phi_h$ as

$$\Phi_h := \{(\varphi,\psi) \in L^1(\mu) \times L^1(\sigma) : \varphi(x) + \psi(y) \le h(|x - y|)\}.$$

Then

$$C_h(\mu,\sigma) = \sup_{(\varphi,\psi)\in\Phi_h} \int \varphi\,d\mu + \int \psi\,d\sigma. \tag{13}$$

An important observation ([?, Lemma 2.3]) is that in the case of a bounded cost function $h$ the supremum
in the dual problem (13) may be taken over a smaller class of functions $\Phi_h^\delta$ for any $\delta > 0$, where

$$\Phi_h^\delta := \Phi_h \cap C_0^\infty(\mathbb{R}^d) \cap \{(\varphi,\psi) : \inf \psi > -\delta \ \text{and}\ \sup \psi \le \|h\|_\infty\}. \tag{14}$$

The proof is based on the fact that the functions $\varphi$ and $\psi$ can be shifted by different constants and truncated in
such a way that the new pair $(\varphi_0, \psi_0)$ is still admissible and the value in (13) does not decrease:

$$\int \varphi_0\,d\mu + \int \psi_0\,d\sigma \ge \int \varphi\,d\mu + \int \psi\,d\sigma \quad\text{and}\quad \inf_{\mathbb{R}^d}\psi_0 = 0,\ \ \sup_{\mathbb{R}^d}\psi_0 \le \|h\|_\infty,\ \ \sup_{\mathbb{R}^d}\varphi_0 \ge 0.$$

If we want to deal with smooth compactly supported functions, then the bounds get a bit worse and lead to
the class (14).

In the sequel we shall take the supremum in (13) over the class $\Phi_h^\delta$ of admissible pairs of $C_0^\infty(\mathbb{R}^d)$-functions
such that $\|\psi\|_\infty \le \|h\|_\infty$.


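Step 4. Adjoint problem. Fix an admissible pair $(\varphi,\psi) \in \Phi_h^\delta$. The smoothness of the operators $A_k^\varepsilon$ implies
[?, Th. 3.2.1] that the following adjoint problems have solutions $g, f \in C_b^{2,1}(\mathbb{R}^d \times [0,t])$:

$$\partial_s g + L_k^\varepsilon g = 0,\ \ g(\cdot,t) = \varphi(\cdot) \quad\text{and}\quad \partial_s f + L_k^\varepsilon f = 0,\ \ f(\cdot,t) = \psi(\cdot). \tag{15}$$

First, due to the maximum principle [?, Th. 3.1.1]

$$\sup_{\mathbb{R}^d\times[0,t]} |g| \le \sup_{\mathbb{R}^d} |\varphi|, \qquad \sup_{\mathbb{R}^d\times[0,t]} |f| \le \sup_{\mathbb{R}^d} |\psi|. \tag{16}$$

Let us derive bounds for $|\nabla g|$ and $|\nabla f|$. The method is inspired by the Bernstein estimates. In the
computation below time is reversed, $s \mapsto t - s$, so that the terminal condition in (15) becomes an initial one.
Denote for brevity $A_k^\varepsilon := (a^1, \ldots, a^d)$. Set $v(x,s) := |\nabla g|^2 + \kappa g^2 - s$, where $\kappa$ will be chosen below. Explicit
computation gives (summation over all repeated indices is assumed)

$$(\partial_s - L_k^\varepsilon)v = 2\,\partial_{x_k}g\,\big(\partial_{x_k}q^{ij}\,\partial^2_{x_i x_j}g + \partial_{x_k}a^i\,\partial_{x_i}g\big) - 2q^{ij}\,\partial^2_{x_k x_i}g\,\partial^2_{x_k x_j}g - 2\kappa\,q^{ij}\,\partial_{x_i}g\,\partial_{x_j}g - 1. \tag{17}$$

Due to dissipativity, $DA_k^\varepsilon$ defines a negative quadratic form and

$$\partial_{x_k}g\,\partial_{x_k}a^i\,\partial_{x_i}g = \langle DA_k^\varepsilon \nabla g, \nabla g\rangle \le 0.$$

By virtue of this observation, (7) and the Cauchy inequality $2ab \le ca^2 + c^{-1}b^2$ with $c = 2\nu$, the right-hand side of (17) is dominated
by

$$\omega c^{-1}|\nabla g|^2 + c\sum_{i,j}(\partial^2_{x_i x_j}g)^2 - 2\nu\sum_{i,j}(\partial^2_{x_i x_j}g)^2 - 2\nu\kappa|\nabla g|^2 - 1 = \Omega|\nabla g|^2 - 2\nu\kappa|\nabla g|^2 - 1,$$

where $\omega := 2\max\{|\partial_{x_k}q^{ij}|\}$ and $\Omega := \omega \cdot (2\nu)^{-1}$ depend only on the diffusion matrix and not on the drift.
Choosing $\kappa := \Omega \cdot (2\nu)^{-1}$, we get

$$(\partial_s - L_k^\varepsilon)v \le -1.$$

Therefore the maximum principle [?, Th. 3.1.1] ensures

$$\max_{\mathbb{R}^d\times[0,t]} v \le \max_{\mathbb{R}^d} |v(x,0)| = \max_{\mathbb{R}^d}|\nabla\varphi|^2 + \kappa\max_{\mathbb{R}^d}|\varphi|^2,$$

hence

$$\sup_{\mathbb{R}^d\times[0,t]} |\nabla g(x,s)| \le \Big(\max_{\mathbb{R}^d}|\nabla\varphi|^2 + \kappa\max_{\mathbb{R}^d}|\varphi|^2 + t\Big)^{1/2} =: C_1. \tag{18}$$

Similarly

$$\sup_{\mathbb{R}^d\times[0,t]} |\nabla f(x,s)| \le \Big(\max_{\mathbb{R}^d}|\nabla\psi|^2 + \kappa\max_{\mathbb{R}^d}|\psi|^2 + t\Big)^{1/2} =: C_2. \tag{19}$$

Set $l := C_1 + C_2$.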


Finally, let us prove the crucial assertion: if the pair $(\varphi,\psi)$ is admissible, and $f$ and $g$ solve (15), then
$g(x,0) + f(y,0) \le h(|x - y|)$. In the case $Q = I$ it was proved in [13, Th. 3.1]. In the general case the proof
almost repeats the case $Q = I$, but we sketch it for completeness. By approximating $h$ from above, we can
assume without loss of generality that $h \in C^1(\mathbb{R})$. Define $H(y_1,y_2) := h(|y_1 - y_2|)$ and

$$\xi(y_1,y_2) = \xi(y_2,y_1) := \begin{cases} \dfrac{h'(|y_1 - y_2|)}{|y_1 - y_2|} & \text{if } y_1 \ne y_2,\\[2mm] 0 & \text{if } y_1 = y_2. \end{cases}$$

First assume that

$$\partial_s g + L_k^\varepsilon g > 0, \qquad \partial_s f + L_k^\varepsilon f > 0.$$

Suppose that $\zeta(y_1,y_2,s) := g(y_1,s) + f(y_2,s) - H(y_1,y_2)$ attains a local maximum at $(Y_1,Y_2,S)$ and $S < t$.
Then $\partial_s\zeta(Y_1,Y_2,S) = \partial_s g(Y_1,S) + \partial_s f(Y_2,S) \le 0$,

$$\nabla_{y_1}\zeta(Y_1,Y_2,S) = \nabla_{y_2}\zeta(Y_1,Y_2,S) = 0, \quad\text{i.e.}\quad \nabla_{y_1}g(Y_1,S) = -\nabla_{y_2}f(Y_2,S) = \xi(Y_1,Y_2)(Y_1 - Y_2),$$

and, due to dissipativity,

$$\langle A_k^\varepsilon(Y_1,S), \nabla_{y_1}g(Y_1,S)\rangle + \langle A_k^\varepsilon(Y_2,S), \nabla_{y_2}f(Y_2,S)\rangle = \xi(Y_1,Y_2)\,\langle A_k^\varepsilon(Y_1,S) - A_k^\varepsilon(Y_2,S), Y_1 - Y_2\rangle \le 0.$$

Since $\zeta(Y_1 + z, Y_2 + z, S)$ as a function of $z$ has a local maximum at $z = 0$ and $Q$ is positive definite,
$\mathrm{trace}\,\mathcal{Q}D^2\zeta = \mathrm{trace}\,Q(Y_1,S)D^2 g + \mathrm{trace}\,Q(Y_2,S)D^2 f \le 0$, where

$$\mathcal{Q}(y_1,y_2,s) := \begin{pmatrix} Q(y_1,s) & 0\\ 0 & Q(y_2,s) \end{pmatrix}.$$

Summing up, we get

$$(\partial_s g + L_k^\varepsilon g) + (\partial_s f + L_k^\varepsilon f) \le 0;$$

this contradiction means that the local maximum can be attained only at $S = t$. Now we proceed to the case of
equality. Setting for some $\epsilon, \delta > 0$

$$g_{\epsilon,\delta}(y_1,s) := g(y_1,s) - \delta(t - s) - \epsilon e^{-s}|y_1|^2, \qquad f_{\epsilon,\delta}(y_2,s) := f(y_2,s) - \delta(t - s) - \epsilon e^{-s}|y_2|^2$$

and computing $\partial_s + L_k^\varepsilon$, we come to the previous case for $\epsilon, \delta$ small enough (since all the coefficients of the
differential operator are bounded). Passing to the limit as $\epsilon, \delta \to 0$, we come to the required assertion.

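Step 5. Deriving the estimate-1. Plugging the solutions of (15) into identity (4), we get

$$\int \varphi\,d\mu_t - \int g(x,0)\,d\mu_0 = -\int_0^t\!\!\int \langle A_k^\varepsilon(x,s) - B_\mu, \nabla g(x,s)\rangle\,d\mu_s\,ds,$$

$$\int \psi\,d\sigma_t - \int f(x,0)\,d\sigma_0 = -\int_0^t\!\!\int \langle A_k^\varepsilon(x,s) - B_\sigma, \nabla f(x,s)\rangle\,d\sigma_s\,ds.$$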


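Because of the gradient bounds (18) and (19), and since $A_k^\varepsilon - B_\sigma = (A_k^\varepsilon - B_\mu) + (B_\mu - B_\sigma)$,

$$\Big|\int_0^t\!\!\int \langle A_k^\varepsilon - B_\mu, \nabla g\rangle\,d\mu_s\,ds\Big| \le C_1\int_0^t\!\!\int |A_k^\varepsilon - B_\mu|\,d\mu_s\,ds,$$

$$\Big|\int_0^t\!\!\int \langle A_k^\varepsilon - B_\sigma, \nabla f\rangle\,d\sigma_s\,ds\Big| \le C_2\int_0^t\!\!\int |A_k^\varepsilon - B_\mu|\,d\sigma_s\,ds + \int_0^t\!\!\int |B_\mu - B_\sigma|\cdot|\nabla f|\,d\sigma_s\,ds.$$

Summing up these inequalities, we get

$$\int \varphi\,d\mu_t + \int \psi\,d\sigma_t \le \int g(x,0)\,d\mu_0 + \int f(x,0)\,d\sigma_0 + l \cdot R_k^\varepsilon + \int_0^t\!\!\int |B_\mu - B_\sigma|\cdot|\nabla f|\,d\sigma_s\,ds,$$

where

$$R_k^\varepsilon := \int_0^t\!\!\int |A_k^\varepsilon - B_\mu|\,d(\mu_s + \sigma_s)\,ds.$$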


By virtue of Step 4 we have $g(x,0) + f(y,0) \le h(|x - y|)$. Thus

$$\int g(x,0)\,d\mu_0 + \int f(x,0)\,d\sigma_0 \le C_h(\mu_0,\sigma_0). \tag{20}$$

So we get

$$\int \varphi\,d\mu_t + \int \psi\,d\sigma_t \le C_h(\mu_0,\sigma_0) + l \cdot R_k^\varepsilon + \int_0^t\!\!\int |B_\mu - B_\sigma|\cdot|\nabla f|\,d\sigma_s\,ds. \tag{21}$$


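Step 6. Integral bound for $\nabla f$. By the ellipticity condition (7) and the Cauchy-Schwarz inequality, the last term in the right-hand side of (21) is dominated by

$$\sqrt{\int_0^t\!\!\int \nu^{-1}|B_\mu - B_\sigma|^2\,d\sigma_s\,ds}\ \cdot \sqrt{\int_0^t\!\!\int |\sqrt{Q}\nabla f|^2\,d\sigma_s\,ds}.$$

To estimate the second multiplier, i.e. $\|\sqrt{Q}\nabla f\|_{L^2(\mathbb{R}^d\times[0,t],\,d\sigma_s ds)}$, note that $f^2$ is a function of the class
$C_b^{2,1}(\mathbb{R}^d \times [0,t)) \cap C(\mathbb{R}^d \times [0,t])$ and it can be plugged into the identity (4) for the measure $\sigma$:

$$\begin{aligned}
\int \psi^2\,d\sigma_t - \int f^2(x,0)\,d\sigma_0 &= \int_0^t\!\!\int (\partial_s + L_\sigma)f^2\,d\sigma_s\,ds\\
&= \int_0^t\!\!\int 2f\big(\partial_s f + \mathrm{trace}(QD^2 f) + \langle B_\sigma, \nabla f\rangle\big) + 2|\sqrt{Q}\nabla f|^2\,d\sigma_s\,ds\\
&= -\int_0^t\!\!\int 2f\langle A_k^\varepsilon - B_\sigma, \nabla f\rangle\,d\sigma_s\,ds + \int_0^t\!\!\int 2|\sqrt{Q}\nabla f|^2\,d\sigma_s\,ds.
\end{aligned}$$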


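Hence

$$2\int_0^t\!\!\int |\sqrt{Q}\nabla f|^2\,d\sigma_s\,ds \le \int \psi^2\,d\sigma_t - \int f^2(x,0)\,d\sigma_0 + 2\max|f(x,t)|\int_0^t\!\!\int \nu^{-1/2}|A_k^\varepsilon - B_\sigma|\cdot|\sqrt{Q}\nabla f|\,d\sigma_s\,ds.$$

The maximum principle (16) and definition (14) imply $\max|f(x,t)| \le \max|\psi(x)| \le \|h\|_\infty$. Taking into
account the Cauchy inequality $ab \le 2^{-1}\gamma a^2 + (2\gamma)^{-1}b^2$ with $\gamma = \|h\|_\infty$, we come to

$$2\int_0^t\!\!\int |\sqrt{Q}\nabla f|^2\,d\sigma_s\,ds \le \|h\|_\infty^2 + \|h\|_\infty^2\,\nu^{-1}\int_0^t\!\!\int |A_k^\varepsilon - B_\sigma|^2\,d\sigma_s\,ds + \int_0^t\!\!\int |\sqrt{Q}\nabla f|^2\,d\sigma_s\,ds.$$

Cancelling like terms and recalling (21), we get

$$\int \varphi\,d\mu_t + \int \psi\,d\sigma_t \le C_h(\mu_0,\sigma_0) + l \cdot R_k^\varepsilon + \|h\|_\infty\,\nu^{-1/2}\cdot r_k^\varepsilon\cdot\sqrt{\int_0^t\!\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds}, \tag{22}$$

where

$$r_k^\varepsilon := \sqrt{1 + \nu^{-1}\int_0^t\!\!\int |A_k^\varepsilon - B_\sigma|^2\,d\sigma_s\,ds}.$$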


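Step 7. Limits as $\varepsilon \to 0$ and $k \to \infty$. Deriving the estimate-2. First of all, we recall that

$$A_k^\varepsilon(x,t) \to A_k(x,t) \ \text{ for a.e. } (x,t) \in \mathbb{R}^d \times [0,T],$$

and the measures $d\sigma_s\,ds$ and $d(\mu_s + \sigma_s)\,ds$ have strictly positive densities on $\mathbb{R}^d \times [0,T]$ with respect to the
Lebesgue measure. Thus

$$A_k^\varepsilon(x,t) \to A_k(x,t) \quad d\sigma_s\,ds\text{-a.e. and } d(\mu_s + \sigma_s)\,ds\text{-a.e.}$$

Since for each $k$ the mappings $A_k^\varepsilon$ and $A_k$ are bounded, Lebesgue's dominated convergence theorem yields

$$\int_0^t\!\!\int |A_k^\varepsilon - B_\sigma|^2\,d\sigma_s\,ds \to \int_0^t\!\!\int |A_k - B_\sigma|^2\,d\sigma_s\,ds, \quad \varepsilon \to 0,$$

$$\int_0^t\!\!\int |A_k^\varepsilon - B_\mu|\,d(\mu_s + \sigma_s)\,ds \to \int_0^t\!\!\int |A_k - B_\mu|\,d(\mu_s + \sigma_s)\,ds, \quad \varepsilon \to 0.$$

Next, recall that

$$\lim_{k\to\infty} A_k(x,t) = B_\mu(x,t) \ \text{ for a.e. } (x,t) \in \mathbb{R}^d \times [0,T].$$

Similarly, taking into account (9), one can apply Lebesgue's dominated convergence theorem and get

$$\lim_{k\to\infty}\lim_{\varepsilon\to 0} R_k^\varepsilon = 0, \qquad \lim_{k\to\infty}\lim_{\varepsilon\to 0} r_k^\varepsilon = \sqrt{1 + \nu^{-1}\int_0^t\!\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds}.$$

Hence one can pass in (22) to the limits as $\varepsilon \to 0$, then $k \to \infty$, and get

$$\int \varphi\,d\mu_t + \int \psi\,d\sigma_t \le C_h(\mu_0,\sigma_0) + \|h\|_\infty\sqrt{\nu^{-1}\int_0^t\!\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds}\ \cdot \sqrt{1 + \nu^{-1}\int_0^t\!\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds}. \tag{23}$$

Passing to the supremum over $(\varphi,\psi) \in \Phi_h^\delta$ and using Step 3, we obtain

$$C_h(\mu_t,\sigma_t) \le C_h(\mu_0,\sigma_0) + \|h\|_\infty\sqrt{\nu^{-1}\int_0^t\!\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds}\ \cdot \sqrt{1 + \nu^{-1}\int_0^t\!\!\int |B_\mu - B_\sigma|^2\,d\sigma_s\,ds},$$

i.e. the estimate (10) with $\lambda = 0$.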


3 Applications to nonlinear equations 


In this section we focus on the applications of the obtained estimate to the study of the well-posedness of 
the Cauchy problem for the nonlinear FPK equations. 

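Given a continuous positive function $\alpha$ on $[0,T]$, $\tau \in (0,T]$ and a non-negative continuous function $V(x)$ on
$\mathbb{R}^d$ with $V(x) \to +\infty$ as $|x| \to +\infty$, define classes of measures

$$M_{\tau,\alpha}(V) = \Big\{\mu = (\mu_t)_{t\in[0,\tau]} : \int V(x)\,d\mu_t \le \alpha(t),\ t \in [0,\tau]\Big\},$$

$$M_\tau(V) = \Big\{\mu = (\mu_t)_{t\in[0,\tau]} : \sup_{t\in[0,\tau]}\int V(x)\,d\mu_t < +\infty\Big\}.$$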

Throughout the section we assume that a non-degenerate $d \times d$-matrix $Q(x,t)$ satisfying (A1) is fixed.
Suppose that for each measure $\mu = \mu_t\,dt \in M_T(V)$ a Borel mapping

$$B(\mu,\cdot,\cdot) = B(\mu)\colon \mathbb{R}^d \times [0,T] \to \mathbb{R}^d$$

is defined. Consider the Cauchy problem for a nonlinear FPK equation

$$\partial_t \mu_t = \mathrm{trace}(Q(x,t) D^2 \mu_t) - \mathrm{div}(B(\mu,x,t)\mu_t), \quad \mu_t|_{t=0} = \mu_0. \tag{24}$$

Again denote the elements of the diffusion matrix $Q(x,t)$ by $q^{ij}(x,t)$, $1 \le i,j \le d$, and the elements of the
vector drift $B(\mu,x,t)$ by $b^j(\mu,x,t)$, $1 \le j \le d$. Set

$$L_\mu\varphi = q^{ij}(x,t)\,\partial^2_{x_i x_j}\varphi + b^i(\mu,x,t)\,\partial_{x_i}\varphi,$$

where summation over all repeated indices is taken. As earlier, we call the measure $\mu = \mu_t\,dt$, $t \in [0,T]$, a
solution to (24) if the identity (3) holds with $L_\mu$ instead of $L$. Introduce the following assumptions on the
drift:

(B1) The drift term $B$ is $\lambda$-dissipative in $x$, i.e. for every measure $\mu \in M_T(V)$

$$\langle B(\mu,x,t) - B(\mu,y,t), x - y\rangle_{\mathbb{R}^d} \le \lambda|x - y|^2 \tag{25}$$

for all $x, y \in \mathbb{R}^d$ and all $t \in [0,T]$.

(B2) For all measures $\mu$ and $\sigma$ from $M_T(V)$

$$B(\mu,x,t) - \lambda x \in L^2(\mathbb{R}^d \times [0,T],\, d(\mu_s + \sigma_s)\,ds). \tag{26}$$


We start with the question of uniqueness and stability of the probability solution to (24). As earlier, we
assume that some continuous non-decreasing bounded cost function $h$ with $h(0) = 0$ is fixed.
Given a non-negative non-decreasing function $G$, denote

$$G^*(r) := \int_r^1 \frac{du}{G^2(\sqrt{u})}.$$


Corollary 3.1. Fix some non-negative continuous function $V(x)$ on $\mathbb{R}^d$ with $V(x) \to +\infty$ as $|x| \to +\infty$
such that $V \in L^1(\mathbb{R}^d;\mu_0) \cap L^1(\mathbb{R}^d;\sigma_0)$. Assume that the coefficients of the equation (24) satisfy (A1), (B1)
and (B2) with this $V$. Moreover, assume that for each two measures $\mu = (\mu_t)_{t\in[0,T]}$ and $\sigma = (\sigma_t)_{t\in[0,T]}$ from
$M_T(V)$

$$|B(\mu,x,t) - B(\sigma,x,t)| \le \sqrt{V(x)}\,G(C_h(\mu_t,\sigma_t)) \tag{27}$$

for some non-negative increasing function $G$ such that $G^*(0) = +\infty$.

Then each two solutions $(\mu_t)_{t\in[0,T]}$ and $(\sigma_t)_{t\in[0,T]}$ of the problem (24) from the class $M_T(V)$ with initial data
$\mu_0$ and $\sigma_0$ respectively satisfy

$$C_{h_{\lambda t}}(\mu_t,\sigma_t) \le \Big((G^*)^{-1}\big(G^*(2(C_h(\mu_0,\sigma_0))^2) - ct\big)\Big)^{1/2}$$

for all $t \in [0,T]$; here $(G^*)^{-1}$ is the inverse function to $G^*$, and $c > 0$ is some positive constant.

Example 3.1. Assumptions (B1), (B2) and (27) are fulfilled, for example, for drift terms of the form

$$B(\mu,x,t) = H(x)\int k(x,y)\,d\mu_t(y)$$

with $0 \le H(x) \le \sqrt{V(x)}$ and a kernel $k(\cdot,\cdot)$ that is $\lambda$-dissipative in the first variable and such that

$$|k(x,y) - k(x,z)| \le h(|y - z|).$$





Proof of Corollary 3.1. First of all, if $\mu$ is a solution to (24) and the assumptions of Corollary 3.1 are fulfilled,
then the linear FPK equation

$$\partial_t \rho_t = \mathrm{trace}(Q(x,t) D^2 \rho_t) - \mathrm{div}(B(\mu,x,t)\rho_t), \quad \rho_t|_{t=0} = \mu_0$$

has the solution $\rho = \mu$ and it satisfies the assumptions of Theorem 2.1; the same holds for $\sigma$. Hence one can apply (10)
with $B_\mu(\cdot,\cdot) = B(\mu,\cdot,\cdot)$ and $B_\sigma(\cdot,\cdot) = B(\sigma,\cdot,\cdot)$.


Next, arguing as in Step 1 of the proof of Theorem 2.1, we can assume that the drift term $B$ is dissipative.
With condition (27) in hand, the estimate (10) takes the form

$$C_h(\mu_t,\sigma_t) \le C_h(\mu_0,\sigma_0) + \|h\|_\infty\sqrt{\nu^{-1}\alpha\int_0^t G^2(C_h(\mu_s,\sigma_s))\,ds}\ \cdot \sqrt{1 + \nu^{-1}\alpha\int_0^t G^2(C_h(\mu_s,\sigma_s))\,ds}, \tag{28}$$

where $\alpha = \sup_{t\in[0,T]}\int V(x)\,d\sigma_t < +\infty$ and $\nu$ is the ellipticity constant of $Q$. Note that $C_h(\mu_t,\sigma_t) \le \|h\|_\infty$.
Then (28) can be reduced to a weaker inequality

$$C_h(\mu_t,\sigma_t) \le C_h(\mu_0,\sigma_0) + K\sqrt{\int_0^t G^2(C_h(\mu_s,\sigma_s))\,ds} \tag{29}$$

with $K = \|h\|_\infty\sqrt{\nu^{-1}\alpha}\cdot\sqrt{1 + \nu^{-1}\alpha\cdot T G^2(\|h\|_\infty)}$. Squaring (29) and using the inequality $(b + c)^2 \le 2b^2 + 2c^2$,
we get

$$C_h(\mu_t,\sigma_t)^2 \le 2C_h(\mu_0,\sigma_0)^2 + 2K^2\int_0^t G^2(C_h(\mu_s,\sigma_s))\,ds.$$

If $\mu_0 = \sigma_0$, then uniqueness follows immediately by explicit integration. In the general case a Gronwall
type inequality (for example, [8, Th. 27]) implies

$$C_h(\mu_t,\sigma_t) \le \Big((G^*)^{-1}\big(G^*(2(C_h(\mu_0,\sigma_0))^2) - 2K^2 t\big)\Big)^{1/2}.$$

□


The particular case $G(u) = u$ of the latter estimate is especially interesting:

Corollary 3.2. Let $\mu$ and $\sigma$ be two solutions to (24) as in Corollary 3.1 with $G(u) = u$. Then for some
$N > 0$

$$C_{h_{\lambda t}}(\mu_t,\sigma_t) \le \sqrt{2}\,C_h(\mu_0,\sigma_0)\,e^{Nt}.$$

In particular, if the drift is dissipative ($\lambda = 0$) or $\lambda < 0$, then

$$C_h(\mu_t,\sigma_t) \le \sqrt{2}\,C_h(\mu_0,\sigma_0)\,e^{Nt}.$$
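The passage from the general stability bound to the exponential bound for $G(u) = u$ can be checked numerically. The sketch below assumes the normalization $G^*(r) = \int_r^1 du / G^2(\sqrt{u})$, so that $G^*(r) = -\ln r$ and $(G^*)^{-1}(y) = e^{-y}$ when $G(u) = u$; the upper limit of integration only shifts $G^*$ by a constant, which cancels in the bound. The function names are illustrative, and `K` stands for the constant from the proof of Corollary 3.1, so that $N = K^2$ under this reading.

```python
import math


def g_star(r):
    """G*(r) = integral from r to 1 of du/u = -ln(r), valid for G(u) = u (assumed normalization)."""
    return -math.log(r)


def g_star_inv(y):
    """Inverse of G* for G(u) = u."""
    return math.exp(-y)


def stability_bound(c0, K, t):
    """General bound ((G*)^{-1}(G*(2*c0^2) - 2*K^2*t))^(1/2), specialised to G(u) = u."""
    return math.sqrt(g_star_inv(g_star(2.0 * c0 ** 2) - 2.0 * K ** 2 * t))


def exponential_bound(c0, K, t):
    """Closed form sqrt(2) * c0 * exp(K^2 * t), i.e. N = K^2."""
    return math.sqrt(2.0) * c0 * math.exp(K ** 2 * t)


if __name__ == "__main__":
    c0, K = 0.1, 0.7
    for t in (0.0, 0.5, 2.0):
        print(t, stability_bound(c0, K, t), exponential_bound(c0, K, t))
```

Both columns printed by the script agree up to rounding, which reflects the identity $\exp(\ln(2c_0^2) + 2K^2 t)^{1/2} = \sqrt{2}\,c_0\,e^{K^2 t}$.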


In some cases the estimate (10) makes it possible to establish existence of a solution to the nonlinear equation (24). To
show this, consider $h(r) = \min\{|r|^p, 1\}$ for some $p > 1$. Recall that in this case $C_h^{1/p}(\mu_t,\sigma_t)$ is a metric and
turns the space of probability measures into a complete metric space. Moreover, convergence with respect
to this metric is equivalent to weak convergence (see [6, Th. 1.1.9]).

Corollary 3.3. Suppose there exists a function $V \in C^2(\mathbb{R}^d)$, $V \ge 1$, such that $V(x) \to +\infty$ as $|x| \to +\infty$
and there exists a positive function $\Lambda$ on $[0,+\infty)$ such that

$$(L_\mu V)(x,t) \le \Lambda(\alpha(t))(1 + V(x))$$

for each $\alpha \in C^+[0,T]$, each $(x,t) \in \mathbb{R}^d \times [0,T]$ and each $\mu \in M_{T,\alpha}(V)$. Assume that the coefficients
in (24) satisfy (A1), (B1) and (B2) with this function $V$. Assume that $B(\sigma^n) \to B(\sigma)$ in $L^2(\mathbb{R}^d \times [0,T], d\sigma_s\,ds)$
as $n \to \infty$ if the measures $\sigma^n(dx\,dt) = \sigma_t^n(dx)\,dt$ weakly converge to a measure $\sigma(dx\,dt) = \sigma_t(dx)\,dt$ on the strip
$\mathbb{R}^d \times [0,T]$.

Then for every probability measure $\mu_*$ such that $V \in L^1(\mathbb{R}^d;\mu_*)$, there exists a (local) probability solution
$\mu = (\mu_t)_{t\in[0,\tau]}$ to (24) with initial condition $\mu_*$.








Example 3.2. Let $k(x,y)$ be a bounded function, $\lambda$-dissipative in the first variable for every $y \in \mathbb{R}^d$. Let
$Q(x,t)$ be a matrix satisfying (A1). Then the Cauchy problem (24) with

$$B(\mu,x,t) = \int k(x,y)\,d\mu_t(y)$$

satisfies all assumptions of Corollary 3.3 with $V(x) = 1 + |x|^2$ and any initial probability measure $\mu_*$ with finite
second moment.

Example 3.3. Let $V \ge 0$ be some $C^2$-function on $\mathbb{R}^d$ with at least linear growth. Let $g(x)$ be a $\lambda$-dissipative
function such that $|g| \le \sqrt{V}$. Let $Q(x,t)$ be a matrix satisfying (A1). Then the Cauchy problem (24) with

$$B(\mu,x,t) = g(x)\int k(y)\,d\mu_t(y)$$

with some non-negative continuous bounded kernel $k(y)$ satisfies all assumptions of Corollary 3.3 with any
initial probability measure $\mu_*$ that integrates $V$.

Example 3.4. Fix $\alpha \in (0,1)$ and a matrix $Q$ satisfying (A1). Then the Cauchy problem (24) with

$$B(\mu,x,t) = -(|x|^{\alpha-1}x) * \mu_t$$

satisfies all assumptions of Corollary 3.3 with $V(x) = 1 + |x|^2$ and any initial probability measure $\mu_*$ with finite
second moment (cf. [?, Proposition 2.1]).
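The dissipativity of the kernel in Example 3.4 can be illustrated in one dimension. The sketch below checks on a grid that $g(x) = -|x|^{\alpha-1}x$ satisfies $(g(x) - g(y))(x - y) \le \lambda (x - y)^2$ with $\lambda = 0$, and that the same check fails for the expanding drift $g(x) = x$. The helper names and the grid are illustrative choices; a grid test is only a sanity check, not a proof. Since the drift in the example is an average of shifted copies of the kernel with respect to $\mu_t$, dissipativity of the kernel carries over to the convolution.

```python
def power_drift(x, alpha):
    """Kernel g(x) = -|x|**(alpha - 1) * x = -sign(x) * |x|**alpha, with g(0) = 0."""
    if x == 0.0:
        return 0.0
    return -abs(x) ** (alpha - 1.0) * x


def is_dissipative_on_grid(g, points, lam=0.0, tol=1e-12):
    """Check (g(x) - g(y)) * (x - y) <= lam * (x - y)**2 for all pairs of grid points."""
    return all(
        (g(x) - g(y)) * (x - y) <= lam * (x - y) ** 2 + tol
        for x in points
        for y in points
    )


if __name__ == "__main__":
    grid = [i / 10.0 for i in range(-30, 31)]
    print(is_dissipative_on_grid(lambda x: power_drift(x, 0.5), grid))  # kernel of Example 3.4
    print(is_dissipative_on_grid(lambda x: x, grid))                    # expanding drift
    print(is_dissipative_on_grid(lambda x: x, grid, lam=1.0))           # but it is 1-dissipative
```

The last line of the script shows the role of $\lambda$: a drift that is not dissipative can still be $\lambda$-dissipative for a suitable $\lambda > 0$, which is the situation handled by the rescaling in Step 1 of the proof of Theorem 2.1.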


Proof. As earlier, without loss of generality, the drift term is dissipative. Let $\sigma \in M_{\tau,\alpha}(V)$ for some $\tau, \alpha$.
Consider

$$\partial_t \mu_t = \mathrm{trace}(Q(x,t) D^2 \mu_t) - \mathrm{div}(B(\sigma,x,t)\mu_t), \quad \mu_0 = \mu_*.$$

Note that the dissipativity of the drift ensures that it is locally bounded in $(x,t)$. Hence under the assumptions
of the corollary there exists a unique probability solution $\mu = (\mu_t)_{t\in[0,\tau]}$ in $M_\tau(V)$ (see [?, Theorem 3.6]).
Therefore the mapping $\Theta\colon M_{\tau,\alpha}(V) \to M_\tau(V)$,

$$\mu = \Theta(\sigma) \iff \partial_t \mu_t = \mathrm{trace}(Q(x,t) D^2 \mu_t) - \mathrm{div}(B(\sigma,x,t)\mu_t), \quad \mu_0 = \mu_*,$$

is well defined. It is obvious that the solutions to (24) are exactly the fixed points of the mapping $\Theta$.


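Define the subclass $N_{\tau,\alpha}(V)$ of the class $M_{\tau,\alpha}(V)$ as follows:

$$N_{\tau,\alpha}(V) := \Big\{\mu \in M_{\tau,\alpha}(V) : \Big|\int \varphi(x)\,d(\mu_t - \mu_s)\Big| \le K(\tau,\alpha,\varphi)\cdot|t - s| \ \ \forall\,\varphi \in C_0^\infty(\mathbb{R}^d)\Big\},$$

where

$$K(\tau,\alpha,\varphi) := \sup\{|L_\mu\varphi(x,t)|,\ (x,t) \in \mathbb{R}^d \times [0,\tau],\ \mu \in M_{\tau,\alpha}(V)\}.$$

Obviously $N_{\tau,\alpha}$ is a convex set. By virtue of [?, Corollary 4] there exist $\alpha(t) > 0$ and $\tau \in (0,T]$ such that
$\Theta(N_{\tau,\alpha}(V)) \subset N_{\tau,\alpha}(V)$. Moreover, the class $N_{\tau,\alpha}(V)$ is a compact set in the topology of weak convergence
of measures on the strip $\mathbb{R}^d \times [0,\tau]$ by [?, Corollary 1]. Let us check the continuity of the mapping $\Theta$ on
$N_{\tau,\alpha}(V)$. Suppose that the sequence $\sigma^n = (\sigma_t^n) \in N_{\tau,\alpha}(V)$ weakly converges to $\sigma = (\sigma_t) \in N_{\tau,\alpha}(V)$. Set
$\mu^n := \Theta(\sigma^n)$, $\mu := \Theta(\sigma)$. Due to (10) and since the initial conditions coincide, we have

$$C_h(\mu_t^n,\mu_t) \le \|h\|_\infty\sqrt{\nu^{-1}\int_0^t\!\!\int |B(\sigma^n) - B(\sigma)|^2\,d\sigma_s\,ds}\ \cdot \sqrt{1 + \nu^{-1}\int_0^t\!\!\int |B(\sigma^n) - B(\sigma)|^2\,d\sigma_s\,ds}.$$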


Our conditions imply that the right-hand side goes to zero as $n \to \infty$. Hence $\mu_t^n$ converges to $\mu_t$ with respect
to the metric $C_h^{1/p}$ and thus converges weakly. Let us show that $\mu^n$ converges to $\mu$ on the strip $\mathbb{R}^d \times [0,\tau]$.
Fix some continuous bounded function $\zeta(x,t)$. Then for each $t \in [0,\tau]$ we have

$$\int \zeta(x,t)\,d\mu_t^n \to \int \zeta(x,t)\,d\mu_t, \quad n \to \infty.$$

Since the measures $\mu_t^n$ are probability measures and $\zeta$ is bounded, the integrals $\int \zeta(x,t)\,d\mu_t^n$
are uniformly bounded and pointwise (with respect to $t \in [0,\tau]$) convergent to $\int \zeta(x,t)\,d\mu_t$. Therefore
Lebesgue's dominated convergence theorem ensures

$$\int_0^\tau\!\!\int \zeta(x,t)\,d\mu_t^n\,dt \to \int_0^\tau\!\!\int \zeta(x,t)\,d\mu_t\,dt, \quad n \to \infty.$$

By definition this means that the sequence $\mu^n$ converges weakly to $\mu$ on the strip $\mathbb{R}^d \times [0,\tau]$.


Summarizing, we have a continuous mapping $\Theta$ that is defined on the convex compact set $N_{\tau,\alpha}(V)$ and maps it into itself.
The Schauder fixed-point theorem ensures that there exists a fixed point of $\Theta$ in $N_{\tau,\alpha}(V)$, i.e. there exists
a solution $\mu = (\mu_t)_{t\in[0,\tau]}$ to (24) with initial condition $\mu_*$. □